Efficient Temporal Mean Shift for Activity Recognition in Video

Authors

  • Yan Ke
  • Rahul Sukthankar
  • Martial Hebert

Abstract

We propose a temporal mean shift algorithm that clusters spatio-temporal regions in video by exploiting the temporal nature of video. Extracting spatio-temporal regions is often one of the first pre-processing steps in an activity recognition system. Our key contribution is the insight that mean shift clustering can exploit the fact that there is typically very little change between successive video frames. Most of the pixels, and therefore the clusters, shift only slightly from frame to frame. Since mean shift is an iterative procedure, fewer iterations are required for convergence if the initial search is already close to the local optimum. Our temporal mean shift algorithm exploits the temporal similarity between successive frames by initializing the search using the modes found in the previous frame.

Standard Mean Shift Clustering

  • Finds local modes in the clusters of points
  • Iterative gradient ascent procedure
  • Let fi be a d-dimensional point in a set of n feature points f1...fn. We calculate the sequence y(i,j), where j = 1, 2, ..., with kernel g, typically a Gaussian kernel, and bandwidth h.
  • The mode is the limit of the series, where y(i,j) converges to a fixed point.
  • The inner loop of mean shift requires a range search for neighbors fk near y(i,j), which is particularly time-consuming in high dimensions.
  • Previous methods for optimizing standard mean shift focused on reducing the time needed for the range search of neighbors.

Temporal Mean Shift Clustering

  • We observe that in most videos, there is very little change between successive frames.
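The warm-start idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (mean_shift_mode, cluster_frame), the toy 2-D data, the brute-force neighbor sum in place of a range search, and the assumption that point i in one frame corresponds to point i in the next are all hypothetical choices made for illustration. The update used is the usual kernel-weighted mean with Gaussian weights exp(-||y - fk||^2 / (2 h^2)).

```python
import math


def mean_shift_mode(start, points, h, tol=1e-6, max_iter=500):
    """Iterate the mean shift update from `start`; return (mode, iterations used)."""
    y = list(start)
    for it in range(1, max_iter + 1):
        num = [0.0] * len(y)
        den = 0.0
        for f in points:  # brute force over all points (no range search)
            d2 = sum((a - b) ** 2 for a, b in zip(y, f))
            w = math.exp(-0.5 * d2 / (h * h))  # Gaussian kernel weight
            den += w
            for k, fk in enumerate(f):
                num[k] += w * fk
        if den == 0.0:  # no point has numerical support near y
            return y, it
        y_next = [n / den for n in num]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_next, y)))
        y = y_next
        if shift < tol:  # converged to a fixed point
            return y, it
    return y, max_iter


def cluster_frame(points, h, starts=None):
    """Find a mode for every point; starts[i] replaces the cold start points[i]."""
    if starts is None:
        starts = points
    modes, total_iters = [], 0
    for s in starts:
        mode, iters = mean_shift_mode(s, points, h)
        modes.append(mode)
        total_iters += iters
    return modes, total_iters


# Toy "video" (assumption): two blobs of feature points that move slightly
# between frame 1 and frame 2.
offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
centers = [(0.0, 0.0), (5.0, 5.0)]
frame1 = [(cx + dx, cy + dy) for cx, cy in centers for dx in offsets for dy in offsets]
frame2 = [(x + 0.005, y + 0.003) for x, y in frame1]
h = 0.3  # bandwidth

modes1, _ = cluster_frame(frame1, h)
cold_modes, cold_iters = cluster_frame(frame2, h)                # start at each point
warm_modes, warm_iters = cluster_frame(frame2, h, starts=modes1)  # start at previous modes

print("cold-start iterations:", cold_iters)
print("warm-start iterations:", warm_iters)
```

On this toy data both runs reach the same modes to within the tolerance, while the warm start begins much closer to each fixed point and so spends fewer iterations in total. How a real system pairs pixels across frames, and when it falls back to a cold start, is left open here.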

Similar resources

Recognition of Visual Events using Spatio-Temporal Information of the Video Signal

Recognition of visual events as a video analysis task has become popular in the machine learning community. While traditional approaches to video event detection have been used for a long time, recently developed deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...

Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study

Despite considerable advances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...

Applying mean shift and motion detection approaches to hand tracking in sign language

Hand gesture recognition is very important for communicating in sign language. In this paper, an effective object tracking and hand gesture recognition method is proposed. This method is a combination of two well-known approaches, the mean shift and the motion detection algorithm. The mean shift algorithm can track objects based on color, but when the hand passes the face, occlusion happens. Several...

An Efficient Hierarchical Modulation based Orthogonal Frequency Division Multiplexing Transmission Scheme for Digital Video Broadcasting

Due to the increase in users, the efficient usage of spectrum plays an important role in digital terrestrial television networks. In digital video broadcasting, local and global content are transmitted by single frequency networks and multifrequency networks respectively. Multifrequency networks support transmission of global content and consume a large amount of spectrum. Similarly local content are well ...

An Efficient Adaptive Boundary Matching Algorithm for Video Error Concealment

Sending compressed video data in error-prone environments (like the Internet and wireless networks) might cause data degradation. Error concealment techniques try to conceal errors in the received data on the decoder side. In this paper, an adaptive boundary matching algorithm is presented for recovering the damaged motion vectors (MVs). This algorithm uses an outer boundary matching or directional tempo...

Publication date: 2005